Learning to Learn to Disambiguate: Meta-Learning for Few-Shot Word Sense Disambiguation
The success of deep learning methods hinges on the availability of large
training datasets annotated for the task of interest. In contrast to human
intelligence, these methods lack versatility and struggle to learn and adapt
quickly to new tasks, where labeled data is scarce. Meta-learning aims to solve
this problem by training a model on a large number of few-shot tasks, with an
objective to learn new tasks quickly from a small number of examples. In this
paper, we propose a meta-learning framework for few-shot word sense
disambiguation (WSD), where the goal is to learn to disambiguate unseen words
from only a few labeled instances. Meta-learning approaches have so far been
typically tested in an N-way, K-shot classification setting where each task
has N classes with K examples per class. Owing to its nature, WSD deviates
from this controlled setup and requires the models to handle a large number of
highly unbalanced classes. We extend several popular meta-learning approaches
to this scenario, and analyze their strengths and weaknesses in this new
challenging setting.
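The unbalanced episodic setup described above can be sketched as follows. This is a minimal illustration with hypothetical sense labels and cap parameters, not the paper's implementation: unlike the balanced N-way, K-shot recipe, rare senses simply contribute fewer examples.

```python
import random
from collections import defaultdict

def sample_episode(labeled_instances, n_support=4, n_query=4, seed=None):
    """Sample one few-shot WSD episode for a single target word.

    labeled_instances: list of (sentence, sense_label) pairs.
    Senses may be highly unbalanced, so we take at most n_support
    support and n_query query examples per sense and keep whatever
    a rare sense can provide.
    """
    rng = random.Random(seed)
    by_sense = defaultdict(list)
    for sent, sense in labeled_instances:
        by_sense[sense].append(sent)
    support, query = [], []
    for sense, sents in by_sense.items():
        rng.shuffle(sents)
        support += [(s, sense) for s in sents[:n_support]]
        query += [(s, sense) for s in sents[n_support:n_support + n_query]]
    return support, query

# Toy data: 3 instances of a rare sense, 20 of a frequent one.
data = [(f"sent{i}", "bank/river") for i in range(3)] + \
       [(f"sent{i + 3}", "bank/finance") for i in range(20)]
support, query = sample_episode(data, seed=0)
```

The rare sense ends up with only 3 support and 0 query examples here, which is exactly the kind of imbalance the controlled N-way, K-shot setting hides.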
Modeling brain activity associated with metaphor processing with distributional semantic models
In this study we investigate how lexical-semantic relations associated with the literal meaning (and abstract meaning) are accessed across the brain during familiar metaphor comprehension. We utilize a data-driven whole-brain searchlight similarity-decoding analysis. We contrast decoding metaphoric phrases ("she's grasping the idea") using distributional semantic models of the verb in the phrase (VERB model) versus that of the more abstract verb-sense (PARAPHRASE VERB model) obtained from literal paraphrases of the metaphoric phrases ("she's understanding the idea"). We showed successful decoding with the VERB model across frontal, temporal and parietal lobes, mainly within areas of the language and default-mode networks. In contrast, decoding with the PARAPHRASE VERB model was restricted to frontal-temporal lobes within areas of the language network, which overlapped to some extent with significant decoding with the VERB model. Overall, the results suggest that lexical-semantic relations closely associated with the abstract meaning in metaphor processing are largely localized to language and amodal (multimodal) semantic memory systems of the brain, while those more associated with the literal meaning are processed across a distributed semantic network including areas implicated in mental imagery and social cognition.
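The pairwise similarity-decoding step can be illustrated in miniature. This is a generic sketch that assumes brain patterns have already been mapped into the semantic model's space; the actual analysis uses cross-validated searchlight decoding over voxel patterns.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equal-length vectors."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sqrt(sum((x - ma) ** 2 for x in a))
    vb = sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (va * vb)

def pairwise_decode(brain_i, brain_j, model_i, model_j):
    """Two-alternative decoding test: count a success when the matched
    brain/model pairing correlates better than the mismatched pairing."""
    matched = pearson(brain_i, model_i) + pearson(brain_j, model_j)
    mismatched = pearson(brain_i, model_j) + pearson(brain_j, model_i)
    return matched > mismatched
```

With this criterion, chance-level decoding corresponds to 50% pairwise accuracy over many item pairs.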
Examining Modularity in Multilingual LMs via Language-Specialized Subnetworks
Recent work has proposed explicitly inducing language-wise modularity in
multilingual LMs via sparse fine-tuning (SFT) on per-language subnetworks as a
means of better guiding cross-lingual sharing. In this work, we investigate (1)
the degree to which language-wise modularity naturally arises within models
with no special modularity interventions, and (2) how cross-lingual sharing and
interference differ between such models and those with explicit SFT-guided
subnetwork modularity. To quantify language specialization and cross-lingual
interaction, we use a Training Data Attribution method that estimates the
degree to which a model's predictions are influenced by in-language or
cross-language training examples. Our results show that language-specialized
subnetworks do naturally arise, and that SFT, rather than always increasing
modularity, can decrease language specialization of subnetworks in favor of
more cross-lingual sharing.
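The Training Data Attribution idea can be sketched with a TracIn-style gradient dot-product proxy. The function names and the toy specialization measure below are illustrative assumptions, not the paper's exact method:

```python
def influence(train_grad, test_grad):
    """TracIn-style proxy: the dot product of per-example gradients
    estimates how much one training example influenced a test prediction."""
    return sum(a * b for a, b in zip(train_grad, test_grad))

def in_language_share(test_lang, test_grad, train_examples, k=2):
    """Toy specialization measure: the fraction of the top-k most
    influential training examples that share the test example's language.
    A value near 1.0 suggests a language-specialized subnetwork."""
    ranked = sorted(train_examples,
                    key=lambda ex: influence(ex["grad"], test_grad),
                    reverse=True)
    return sum(ex["lang"] == test_lang for ex in ranked[:k]) / k

train = [{"lang": "de", "grad": [1.0, 0.0]},
         {"lang": "de", "grad": [0.9, 0.1]},
         {"lang": "en", "grad": [0.0, 1.0]}]
```

Here a German test gradient aligned with the German training gradients yields an in-language share of 1.0, i.e. fully language-specialized attribution.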
Neural Character-based Composition Models for Abuse Detection
The advent of social media in recent years has fed into some highly
undesirable phenomena such as proliferation of offensive language, hate speech,
sexist remarks, etc. on the Internet. In light of this, there have been several
efforts to automate the detection and moderation of such abusive content.
However, deliberate obfuscation of words by users to evade detection poses a
serious challenge to the effectiveness of these efforts. The current state of
the art approaches to abusive language detection, based on recurrent neural
networks, do not explicitly address this problem and resort to a generic OOV
(out of vocabulary) embedding for unseen words. However, in using a single
embedding for all unseen words we lose the ability to distinguish between
obfuscated and non-obfuscated or rare words. In this paper, we address this
problem by designing a model that can compose embeddings for unseen words. We
experimentally demonstrate that our approach significantly advances the current
state of the art in abuse detection on datasets from two different domains,
namely Twitter and Wikipedia talk pages. Comment: In Proceedings of the EMNLP Workshop on Abusive Language Online 201
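One common way to compose an embedding for an unseen word is from its character n-grams, fastText-style. This is a generic sketch rather than the paper's architecture, and the hash-based "embeddings" stand in for learned parameters:

```python
import hashlib

DIM = 8  # toy embedding size

def ngrams(word, n=3):
    """Character n-grams with boundary markers, e.g. '<ca', 'cat', 'at>'."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def ngram_vec(gram):
    """Deterministic pseudo-embedding for an n-gram (a hash-based
    stand-in for a learned character n-gram lookup table)."""
    h = hashlib.md5(gram.encode()).digest()
    return [(b - 128) / 128 for b in h[:DIM]]

def compose(word):
    """Compose an embedding for an unseen word from its character
    n-grams, so an obfuscated spelling keeps the n-grams it shares
    with the original word instead of collapsing to one OOV vector."""
    grams = ngrams(word)
    sums = [0.0] * DIM
    for g in grams:
        sums = [s + x for s, x in zip(sums, ngram_vec(g))]
    return [s / len(grams) for s in sums]
```

Because composition is a function of the surface form, two unseen words get distinct vectors, which is exactly what a single shared OOV embedding cannot provide.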
MetaVR: Understanding metaphors in the mind and relation to emotion through immersive, spatial interaction
Metaphorical thinking acts as a bridge between embodiment and abstraction and helps to flexibly organize human knowledge and behavior. Yet its role in embodied human-computer interface design, and its potential for supporting goals such as self-awareness and well-being, have not been extensively explored in the HCI community. We have designed a system called MetaVR to support the creation and exploration of immersive, multimodal, metaphoric experiences, in which people's bodily actions in the physical world are linked to metaphorically relevant actions in a virtual reality world.
As a team of researchers in interaction, neuroscience, and linguistics, we have created MetaVR to support research exploring the impact of such metaphoric interactions on human emotion and well-being. We have used MetaVR to create a proof-of-concept interface for immersive, spatial interactions underpinned by the WELL-BEING is VERTICALITY conceptual mapping, i.e. the known association of 'good'='up' and 'bad'='down'. Researchers and developers can currently interact with this proof of concept to configure various metaphoric interactions or personifications that have positive associations (e.g., 'being like a butterfly' or 'being like a flower') and also involve vertical motion (e.g., a butterfly might fly upwards, or a flower might bloom upwards). Importantly, the metaphoric interactions supported in MetaVR do not link human movement to VR actions in one-to-one ways, but rather use abstracted relational mappings in which events in VR (e.g., the blooming of a virtual flower) are contingent not merely on a 'correct' gesture being performed, but on aspects of verticality exhibited in human movement (e.g., in a very simple case, the time a person's hands spend above some height threshold).
This work thus serves as a small-scale vehicle for us to research how such interactions may impact well-being. Relatedly, it highlights the potential of using virtual embodied interaction as a tool to study cognitive processes involved in more deliberate/functional uses of metaphor and how this relates to emotion processing. By demonstrating MetaVR and metaphoric interactions designed with it at CHI Interactivity, and by offering the MetaVR tool to other researchers, we hope to inspire new perspectives, discussion, and research within the HCI community about the role that such metaphoric interaction may play, in interfaces designed for well-being and beyond.
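The height-threshold example from the abstract can be sketched as a relational mapping from "up-ness" to a VR event. The function name and parameters below are hypothetical, chosen only to illustrate the idea:

```python
def bloom_progress(hand_heights, threshold=1.5, dt=0.1, full_bloom_s=2.0):
    """Map verticality to a VR event: the virtual flower's bloom
    progress grows with the total time (in seconds) the hands spend
    above a height threshold, clamped to [0, 1].

    hand_heights: hand-height samples (meters), taken every dt seconds.
    This is deliberately not a one-to-one gesture mapping: any movement
    pattern with enough sustained 'up-ness' drives the bloom.
    """
    time_above = sum(dt for h in hand_heights if h > threshold)
    return min(1.0, time_above / full_bloom_s)
```

For instance, one second of hands-above-threshold out of a 1.5-second recording yields a half-bloomed flower, regardless of the exact gesture performed.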
Scientific and Creative Analogies in Pretrained Language Models
This paper examines the encoding of analogy in large-scale pretrained
language models, such as BERT and GPT-2. Existing analogy datasets typically
focus on a limited set of analogical relations, with a high similarity of the
two domains between which the analogy holds. As a more realistic setup, we
introduce the Scientific and Creative Analogy dataset (SCAN), a novel analogy
dataset containing systematic mappings of multiple attributes and relational
structures across dissimilar domains. Using this dataset, we test the
analogical reasoning capabilities of several widely-used pretrained language
models (LMs). We find that state-of-the-art LMs achieve low performance on
these complex analogy tasks, highlighting the challenges still posed by analogy
understanding. Comment: To be published in Findings of EMNLP 202
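For contrast with probing LMs, the classic vector-offset baseline for solving a:b :: c:? over static word embeddings can be sketched as follows; the toy vectors are illustrative, and SCAN itself evaluates pretrained LMs rather than this baseline:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def solve_analogy(a, b, c, vocab):
    """Vector-offset method: return the vocabulary word whose vector
    is closest (by cosine) to b - a + c."""
    target = [bb - aa + cc for aa, bb, cc in zip(a, b, c)]
    return max(vocab, key=lambda w: cosine(vocab[w], target))

# Toy example: man:king :: woman:? with hand-crafted 2-d vectors.
vocab = {"queen": [0.9, 1.0], "apple": [-1.0, 0.2]}
```

This offset trick works for simple relational pairs but has nothing to say about the systematic cross-domain attribute mappings SCAN targets, which is part of why analogy remains hard for LMs too.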
A Comparison of Architectures and Pretraining Methods for Contextualized Multilingual Word Embeddings
The lack of annotated data in many languages is a well-known challenge within
the field of multilingual natural language processing (NLP). Therefore, many
recent studies focus on zero-shot transfer learning and joint training across
languages to overcome data scarcity for low-resource languages. In this work we
(i) perform a comprehensive comparison of state-of-the-art multilingual word and
sentence encoders on the tasks of named entity recognition (NER) and part of
speech (POS) tagging; and (ii) propose a new method for creating multilingual
contextualized word embeddings, compare it to multiple baselines and show that
it performs at or above state-of-the-art level in zero-shot transfer settings.
Finally, we show that our method allows for better knowledge sharing across
languages in a joint training setting.
Joint Modelling of Emotion and Abusive Language Detection
The rise of online communication platforms has been accompanied by some
undesirable effects, such as the proliferation of aggressive and abusive
behaviour online. Aiming to tackle this problem, the natural language
processing (NLP) community has experimented with a range of techniques for
abuse detection. While achieving substantial success, these methods have so far
only focused on modelling the linguistic properties of the comments and the
online communities of users, disregarding the emotional state of the users and
how this might affect their language. The latter is, however, inextricably
linked to abusive behaviour. In this paper, we present the first joint model of
emotion and abusive language detection, experimenting in a multi-task learning
framework that allows one task to inform the other. Our results demonstrate
that incorporating affective features leads to significant improvements in
abuse detection performance across datasets. Comment: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 202
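A multi-task setup of this kind typically combines the two task losses over a shared encoder. The weighting scheme and batch interleaving below are generic MTL recipes, not necessarily the paper's exact configuration:

```python
from itertools import chain, zip_longest

def joint_loss(abuse_loss, emotion_loss, alpha=0.7):
    """Multi-task objective: a weighted sum of the two task losses
    computed over a shared encoder. alpha trades off the primary
    abuse-detection task against the auxiliary emotion task."""
    return alpha * abuse_loss + (1 - alpha) * emotion_loss

def interleave(abuse_batches, emotion_batches):
    """Alternate batches from the two tasks so the shared encoder's
    parameters receive gradients from both objectives during training."""
    pairs = zip_longest(abuse_batches, emotion_batches)
    return [b for b in chain.from_iterable(pairs) if b is not None]
```

Because both heads backpropagate into the shared encoder, affective signal learned from the emotion task can shape the representations the abuse head sees, which is the mechanism by which one task informs the other.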
Multilingual and cross-lingual document classification: A meta-learning approach
The great majority of languages in the world are considered under-resourced
for the successful application of deep learning methods. In this work, we
propose a meta-learning approach to document classification in limited-resource
setting and demonstrate its effectiveness in two different settings: few-shot,
cross-lingual adaptation to previously unseen languages; and multilingual joint
training when limited target-language data is available during training. We
conduct a systematic comparison of several meta-learning methods, investigate
multiple settings in terms of data availability and show that meta-learning
thrives in settings with a heterogeneous task distribution. We propose a
simple, yet effective adjustment to existing meta-learning methods which allows
for better and more stable learning, and set a new state of the art on several
languages while performing on-par on others, using only a small amount of
labeled data.
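One widely compared meta-learning method, Prototypical Networks, reduces per-task adaptation to nearest-prototype classification. The sketch below shows that core step with toy 2-d embeddings; it is one plausible baseline from such a comparison, not the adjustment proposed in the paper:

```python
from math import dist  # Euclidean distance, Python 3.8+

def prototypes(support):
    """Compute one prototype per class: the mean embedding of that
    class's support examples (the core of Prototypical Networks)."""
    sums, counts = {}, {}
    for vec, label in support:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], vec)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def classify(query_vec, protos):
    """Assign a query embedding to the class of the nearest prototype."""
    return min(protos, key=lambda lab: dist(protos[lab], query_vec))

# Toy support set: two document classes with 2-d embeddings.
support = [([0.0, 0.0], "sports"), ([0.0, 2.0], "sports"),
           ([4.0, 4.0], "politics"), ([6.0, 4.0], "politics")]
protos = prototypes(support)
```

Adapting to a new language or task then costs only a forward pass over its few labeled examples, with no gradient steps, which is what makes this family attractive in limited-resource settings.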
- …